We introduce a new domain wall operator that represents a full (real) Mobius transformation of a given
non-chiral Dirac kernel. Shamir's and Chiu/Borici's domain wall fermions are special cases of this new class.
By tuning the parameters of the Mobius operator and by introducing a new red/black preconditioning, we are
able to reduce the computational effort substantially.



1. Introduction 

The key idea in evading the Nielsen-Ninomiya no-go theorem [1,2], which forbids the
construction of lattice fermion actions with chiral symmetry under rather general
conditions, was introduced by Kaplan [3]. In his construction, four dimensional chiral
zero modes appear as bound states on a mass defect, or 3-brane, in a five dimensional
theory. Much like in the work of Callan and Harvey [4] in the continuum, anomalous
currents in the four dimensional theory are understood as the flow of conserved five
dimensional currents on or off the mass defect. This work led to two concrete
realizations of lattice fermions with chiral symmetry: the domain wall fermions
[5,6,7,8] and the overlap fermions [9,10,11,12,13].

Here we introduce the Mobius domain wall operator, a generalization of Shamir's and
Chiu/Borici's suggestions. It is given by (to keep the notation simple, we choose the
length of the fifth domain wall dimension, $L_s$, equal to 4)

D DW (m) = (1) 
where $D_W(M_5)$ denotes the Wilson Dirac matrix

$D_W(M_5) = (4 + M_5)\,\delta_{x,y} - \frac{1}{2}\left[(1 - \gamma_\mu)\,U_\mu(x)\,\delta_{x+\mu,y} + (1 + \gamma_\mu)\,U_\mu^\dagger(y)\,\delta_{x,y+\mu}\right]$. (4)

Note that this choice is not mandatory; any other Dirac operator could have been used
here as well.

Eq. (1) is a generic expression for the domain wall fermions. The different operators,
Shamir, Chiu/Borici and Mobius, are characterized by the coefficients $b_i$, $c_i$ of
this operator. The Mobius operator contains Shamir's and Chiu/Borici's suggestions as
special cases. For Mobius, the coefficients are only constrained to be:

• $b_i, c_i \in \mathbb{R}$,

• $b_i - c_i = \text{const}, \ \forall i \in L_s$ (i.e. const is independent of $i$).

Shamir and Chiu/Borici use:

Shamir: $b_i = a$, $c_i = 0$,
Borici: $b_i = a$, $c_i = a$,
Chiu: $b_i = a_i$, $c_i = a_i$,

with $a, a_i \in \mathbb{R}$.

To understand the meaning of the coefficients, we will translate the 5 dimensional
domain wall fermions into a 4 dimensional overlap operator. This is done via a linear
matrix transformation.
2. Domain wall - overlap transformation

The matrix entries are defined as follows:

The domain wall and the overlap operator are connected through a linear matrix
transformation [14,15,16]. The length of the fifth domain wall dimension corresponds
to the order of a polynomial that approximates the sign function on the overlap side.
Accordingly, domain wall fermions can be seen as a preconditioning of the overlap
operator.

In the following, $D_{DW}$ will denote the generic domain wall operator and $L_s$ the
length of the fifth domain wall dimension. Here we will choose $L_s = 4$ to keep the
notation simple, but all formulas hold for any $L_s$. $D_{ov}$ will denote the
approximation to the overlap operator, as defined through the finite-order polynomial
that describes the sign function.

The domain wall - overlap transformation reads:



$L\,D_{DW}(m)\,P\,R = F\,D^{5}_{ov}(m)$
with
$T^{-1}$ is called the transfer matrix.

Multiplying the matrices on the left hand side of the domain wall - overlap
transformation (multiply first by $P$, then by $L$ and $R$, where it might be useful to
remember that $(b P_- + c P_+)^{-1} = \frac{1}{b} P_- + \frac{1}{c} P_+$) leads to the
entry

$[L\,D_{DW}(m)\,P\,R]_{11} = -(P_- - m P_+) + S\,(P_+ - m P_-)$

or

$D_{ov}(m) = \frac{1}{2}\left(1 + m + (1 - m)\,\gamma_5\,\frac{S - 1}{S + 1}\right)$ (16)

with $S = S_1 S_2 S_3 S_4$ (there are four factors $S_1, \ldots, S_4$ because of our
choice $L_s = 4$). If $(S - 1)/(S + 1)$ were an approximation to the sign function,
eq. (16) would be the corresponding approximation to the overlap operator. To see
whether there is such a relation, we define $H_T$ through:



Si = Ht 
A = [H<p + 1){hP + 1){H^ + 1)(H^ +1), (22)
1)(H^-1)(H^-1). (23) 
One can choose the coefficients $b_i$ and $c_i$ such that eq. (21) corresponds to an
approximation $\epsilon$ to the sign function with kernel $H_T$. As mentioned above, we
set $b_i - c_i$ equal to a constant value for all $i$, i.e. the denominator is
independent of $i$.

Possible polynomial approximations are:

• $c_i + b_i = \text{const}, \ \forall i \in L_s$. This corresponds to Neuberger's
polar decomposition.

• $c_i + b_i$ equal to Zolotarev's coefficients [8].

We can summarize these findings as

$D_{ov}(m) = \frac{1}{2}\left(1 + m + (1 - m)\,\gamma_5\,\epsilon(H_T)\right)$. (24)
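As an illustration of eq. (24), the following is a minimal numerical sketch, not the authors' implementation. It assumes a small dense Hermitian stand-in for the kernel $H_T$, a diagonal stand-in for $\gamma_5$, and the exact matrix sign function in place of the approximation $\epsilon$; the function names `exact_sign` and `overlap_operator` are hypothetical.

```python
import numpy as np


def exact_sign(h):
    """Matrix sign function of a Hermitian matrix via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    # Scale each eigenvector column by the sign of its eigenvalue.
    return (evecs * np.sign(evals)) @ evecs.conj().T


def overlap_operator(h, gamma5, m, sign_fn=exact_sign):
    """Toy version of eq. (24): 1/2 * (1 + m + (1 - m) * gamma5 * eps(H)).

    h       : Hermitian stand-in for the kernel H_T (assumption: small, dense)
    gamma5  : stand-in for gamma_5, same shape as h
    sign_fn : approximation eps to the sign function (default: exact sign)
    """
    identity = np.eye(h.shape[0])
    return 0.5 * ((1.0 + m) * identity + (1.0 - m) * (gamma5 @ sign_fn(h)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((4, 4))
    h = a + a.T                               # random real symmetric kernel
    g5 = np.diag([1.0, 1.0, -1.0, -1.0])      # toy gamma_5
    print(overlap_operator(h, g5, m=0.06))
```

Passing a different `sign_fn` (for instance a polar or Zolotarev approximant) is the only change needed to study how the quality of $\epsilon$ enters the operator in this toy setting.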

3. Domain Wall preconditioning 

Here we will describe how the quark propagator can be determined via the 5 dimensional
domain wall operator.

We are interested in the 4 dimensional propagator $x_1$ (corresponding to the source
$b$), given by

$D_{ov}\,x_1 = b$. (25)



The domain wall - overlap transformation can be used to precondition eq. (25). As it
stands, it is not suitable for this task. We therefore perform the following
simplifications:

• Multiply the transformation by $F^{-1}$ from the left:

$R^{-1} P^{-1} D_{DW}^{-1}(1)\,D_{DW}(m)\,P R = D_{ov}$, (26)

• then multiply by $R$ from the left and by $R^{-1}$ from the right:

$P^{-1} D_{DW}^{-1}(1)\,D_{DW}(m)\,P = R\,D_{ov}\,R^{-1}$. (27)

The right hand side of eq. (27) is then given by (with $d = P_+ - m P_-$)
$x_1$ is still the solution of $D_{ov}\,x_1 = b$. Equivalently, we can find $x_1$ by
solving the left hand side of eq. (27).

Typically, in linear system solvers, one iterates $D_{DW}(m)\,y = b$ alone, i.e.
without the matrix $P$. Therefore, one has to reconstruct the real solution $x_1$
afterwards, i.e. $x = P^{-1} y$.

4. The sign function 



For later reference, we state here a few simple properties of the sign function.

The sign function satisfies the following equation:

$\mathrm{sign}(x) = \mathrm{sign}(\lambda x), \quad \forall x \in \mathbb{R}, \ \lambda \in \mathbb{R}^+$. (33)

Let $\epsilon$ be a polynomial approximation to the sign function. For $\epsilon$,
eq. (33) does not hold; instead we have

$\epsilon(x) \neq \epsilon(\lambda x)$. (34)

Eq. (34) can easily be understood by looking at fig. 1. There, we show the quality of
the polar decomposition by plotting its deviation from the sign function. Obviously,
the approximation is best for $x \approx 1$. The factor $\lambda$ slides this curve
along the abscissa.

For the overlap operator (eq. (16)) this means:
Figure 1. $\epsilon(x)$ is given by the polar decomposition of order 16:
$\epsilon(x) = \frac{(1+x)^{16} - (1-x)^{16}}{(1+x)^{16} + (1-x)^{16}}$.



• Eq. (33) states that scaling the kernel $H_T$ is a valid operation.

• Eq. (34) demonstrates that the quality of the approximation to the sign function
depends on this scaling. In other words, there is an optimal scaling factor.
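The effect of the scaling factor can be checked numerically. The sketch below uses the order-16 polar decomposition quoted in the caption of Figure 1 and measures the largest deviation from the sign function on an interval; the interval $[0.1, 2.0]$, the sample scales and the function names are illustrative assumptions, not values from our measurements.

```python
import numpy as np


def polar_sign(x, order=16):
    """Polar decomposition of the given (even) order, as in the Figure 1 caption."""
    x = np.asarray(x, dtype=float)
    plus = (1.0 + x) ** order
    minus = (1.0 - x) ** order
    return (plus - minus) / (plus + minus)


def max_error(scale, lo, hi, order=16, points=2001):
    """Largest deviation |eps(scale * x) - sign(x)| for x in [lo, hi], lo > 0."""
    x = np.linspace(lo, hi, points)
    return float(np.max(np.abs(polar_sign(scale * x, order) - 1.0)))


if __name__ == "__main__":
    # Hypothetical spectral interval [0.1, 2.0]; scan a few scaling factors.
    for scale in (0.5, 1.0, 2.0, 4.0):
        print(f"scale = {scale:3.1f}  max error = {max_error(scale, 0.1, 2.0):.3e}")
```

For this assumed interval the error is smallest for an intermediate scale, which is the behaviour described by eq. (34): the approximation is exact at $x = 1$, and the scaling decides which part of the spectrum sits close to that point.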

5. Comparison of the different domain wall 
fermions 

As demonstrated earlier, the kernels of the domain wall actions are given by (see
eq. (17)):

Mobius: $(b_i + c_i)\,\gamma_5 D_W\,(2 + (b_i - c_i) D_W)^{-1}$, (35)
Shamir: $a\,\gamma_5 D_W\,(2 + a D_W)^{-1}$, (36)
Borici: $a\,\gamma_5 D_W$, Chiu: $a_i\,\gamma_5 D_W$. (37)
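To make the role of the two combinations $b_i + c_i$ and $b_i - c_i$ concrete, here is a small sketch that treats the kernel as a scalar map. This is a deliberate simplification: `w` is a plain number standing in for the operator content, whereas the true kernel involves the matrices $\gamma_5$ and $D_W$. All function names are hypothetical.

```python
def kernel_coefficients(b, c):
    """Return (scale, denom) = (b + c, b - c), the two combinations in eq. (35)."""
    return b + c, b - c


def kernel_eigenvalue(w, b, c):
    """Scalar toy version of eq. (35): (b + c) * w / (2 + (b - c) * w).

    Assumption: w is a number used only for illustration; it is not an
    eigenvalue of the actual lattice operator.
    """
    scale, denom = kernel_coefficients(b, c)
    return scale * w / (2.0 + denom * w)


def shamir(a):
    """Shamir's choice: b = a, c = 0, so scale and denominator coincide."""
    return a, 0.0


def borici(a):
    """Borici's choice: b = c = a, so the denominator reduces to the constant 2."""
    return a, a


if __name__ == "__main__":
    w = 0.5
    print("Shamir a=1 :", kernel_eigenvalue(w, *shamir(1.0)))    # a*w / (2 + a*w)
    print("Borici a=1 :", kernel_eigenvalue(w, *borici(1.0)))    # a*w
    print("Mobius     :", kernel_eigenvalue(w, b=1.8, c=0.3))    # free scale and denominator
```

In this toy picture Shamir's choice ties the scale to the denominator, Borici's choice fixes the denominator, and only the general Mobius choice varies both independently.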

In the following, we will show how the determination of the quark propagator depends
on the choice of the coefficients $b_i$, $c_i$ (and hence on $a$, $a_i$ as well). We
will describe the advantages of the Mobius operator as compared to Shamir's and
Chiu/Borici's suggestions.

5.0.1. The Mobius operator 

As can be seen in eq. (35), the Mobius kernel can be scaled with the coefficient
$b_i + c_i$. In other words, the eigenvalues of the operator
$\gamma_5 D_W (2 + (b_i - c_i) D_W)^{-1}$ can be slid along the abscissa until the
approximation to the sign function is optimal (for given $L_s$).

The coefficients $b_i - c_i$ in the denominator are different. They do not simply act
as scaling factors, but change the spectrum of the operator. In other words, for each
$b_i - c_i$ one has a different matrix. Accordingly, one can use $b_i - c_i$ to tune
the condition number of the kernel operator. The smaller the condition number, the
better the approximation to the sign function will be.

As mentioned above, the denominator will always be chosen independent of $i$. Not so
for $b_i + c_i$: we have $L_s$ different coefficients. This freedom allows for
different choices of polynomials that approximate the sign function on the overlap
side, such as the polar decomposition or Zolotarev's polynomials (see eq. (22) and
eq. (23)).

5.0.2. Shamir's operator 

Shamir's operator allows for a tuning of the condition number, since it possesses a
coefficient $a$ in the denominator. On the other hand, the same coefficient $a$ acts
as the scaling factor in the numerator. Therefore this operator cannot be scaled
without changing the matrix itself. In other words, in the two dimensional space
spanned by the coefficients $b_i + c_i$ and $b_i - c_i$, Shamir's operator can only
exploit the diagonal.

It follows that in this case the only possible polynomial approximation on the overlap
side is Neuberger's polar decomposition.

5.0.3. Chiu/Borici's operator

Chiu/Borici's action has independent coefficients $a_i$ that act solely as scaling
factors. On the other hand, the denominator is constant (equal to 2). Therefore the
condition number for Chiu/Borici's operator cannot be tuned. Note that this operator
corresponds to the standard overlap approach, which employs Dirac fermions, with a
denominator equal to the identity.

5.0.4. Conclusions 

Mobius fermions are a best-of-both-worlds approach. They combine Shamir's tuning of
the condition number with the scalability of Chiu/Borici's action. Our results will
demonstrate that this leads to a significant reduction of the computational costs.






6. Red/black preconditioning 

The standard red/black preconditioning is only applicable for Shamir's action. We
therefore introduce a new red/black partitioning that can be employed for the Mobius
operator (and hence for Shamir and Chiu/Borici as well).

The two preconditioning methods are defined 
as follows: 

• Standard red/black: every neighbour of a 
black point is red. 

• New red/black: every space-time neighbour 
of a black point is red, every neighbour in 
the fifth dimension of a black point is black. 
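The two partitionings can be written as colouring rules on a 5 dimensional site $(x, y, z, t, s)$. The sketch below is one way to realize the two definitions, assuming integer coordinates and even lattice extents (so that periodic boundaries respect the colouring); the function names and the convention 0 = red, 1 = black are illustrative choices.

```python
def standard_colour(x, y, z, t, s):
    """Standard red/black: the parity of all five coordinates decides the colour.

    Every nearest neighbour, including the one in the fifth dimension,
    has the opposite colour.
    """
    return (x + y + z + t + s) % 2


def new_colour(x, y, z, t, s):
    """New red/black: only the space-time parity decides the colour.

    Space-time neighbours have the opposite colour, while neighbours in the
    fifth dimension (s -> s +/- 1) keep the same colour.
    """
    return (x + y + z + t) % 2


if __name__ == "__main__":
    site = (1, 2, 3, 4, 0)
    neighbour_in_s = (1, 2, 3, 4, 1)
    neighbour_in_x = (2, 2, 3, 4, 0)
    for name, colour in (("standard", standard_colour), ("new", new_colour)):
        print(name, colour(*site), colour(*neighbour_in_s), colour(*neighbour_in_x))
```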

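A matrix $M$, acting on a vector $x$, can then be written as:

$M x = \begin{pmatrix} M_{rr} & M_{rb} \\ M_{br} & M_{bb} \end{pmatrix} \begin{pmatrix} x_r \\ x_b \end{pmatrix}$. (38)

Red/black preconditioning of this matrix is then defined through

$L M R = \begin{pmatrix} M_{rr} & 0 \\ 0 & M_{bb} - M_{br} M_{rr}^{-1} M_{rb} \end{pmatrix}$, (39)

with

$L = \begin{pmatrix} I & 0 \\ -M_{br} M_{rr}^{-1} & I \end{pmatrix}$, $R = \begin{pmatrix} I & -M_{rr}^{-1} M_{rb} \\ 0 & I \end{pmatrix}$.

$I$ is the identity operator. In principle, the two sets (red and black) can be chosen
freely. From a practical point of view, however, $M_{rr}$ has to be a simple matrix,
since it has to be inverted in each iteration step of the linear system solver.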
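The block elimination behind red/black preconditioning can be sketched on a small dense matrix. The example below is only an illustration under stated assumptions: it uses direct dense solves from NumPy where a lattice code would use an iterative solver on the reduced (black) system, and the function name `schur_solve` and the index sets are hypothetical.

```python
import numpy as np


def schur_solve(m, b, red, black):
    """Solve M x = b by eliminating the red block (Schur complement on black).

    red, black : lists of indices partitioning range(len(b)).
    """
    m_rr = m[np.ix_(red, red)]
    m_rb = m[np.ix_(red, black)]
    m_br = m[np.ix_(black, red)]
    m_bb = m[np.ix_(black, black)]

    # Apply M_rr^{-1} to M_rb and to the red part of the source.
    rr_inv_rb = np.linalg.solve(m_rr, m_rb)
    rr_inv_b = np.linalg.solve(m_rr, b[red])

    # Reduced operator on the black sites: M_bb - M_br M_rr^{-1} M_rb.
    schur = m_bb - m_br @ rr_inv_rb

    x = np.empty_like(b, dtype=float)
    x[black] = np.linalg.solve(schur, b[black] - m_br @ rr_inv_b)
    # Back-substitution for the red sites.
    x[red] = rr_inv_b - rr_inv_rb @ x[black]
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m = rng.standard_normal((6, 6)) + 6.0 * np.eye(6)
    b = rng.standard_normal(6)
    x = schur_solve(m, b, red=[0, 2, 4], black=[1, 3, 5])
    print(np.max(np.abs(m @ x - b)))
```

The sketch makes the practical requirement visible: $M_{rr}^{-1}$ is applied every time the reduced operator acts, so the partitioning is only useful if that inverse is cheap.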

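To keep the notation simple, let us define $\hat m$ and $D$ for the Wilson operator
(eq. (4)) through $D_W(M_5) = \hat m\,\delta_{x,y} - \frac{1}{2} D$. For standard
red/black preconditioning we find

$M_{rr}^{\text{standard}} = \begin{pmatrix} \ddots & & \\ -\frac{1}{2} c_2 D P_+ & b_2 \hat m + 1 & -\frac{1}{2} c_2 D P_- \\ & & \ddots \end{pmatrix}$. (40)

This matrix is computationally too costly, due to the off-diagonal terms $c_i D$ (note
that for Shamir's operator $c_i = 0$).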



For the new preconditioning method, on the other hand, we find

$M_{rr}^{\text{new}} = \begin{pmatrix} \ddots & & \\ (c_2 \hat m - 1) P_+ & b_2 \hat m + 1 & (c_2 \hat m - 1) P_- \\ & & \ddots \end{pmatrix}$. (41)

$M_{rr}^{\text{new}}$ only contains coefficients and the chiral projectors $P_\pm$ and
thus can be inverted analytically. Therefore, its cost in the linear system solver is
negligible.

The standard preconditioning, which can be applied to Shamir's operator, results in a
numerical speed-up of roughly 2.6. We find the same acceleration for the Mobius
operator with the new preconditioning method. Both methods are therefore equivalent in
terms of convergence, whereas only the new approach is generally applicable.

7. Results 

We will present our results for the Mobius operator. As our measure of performance we
count the number of Wilson Dirac applications that the linear system solver, the
conjugate gradient method on the normal equation $D_{DW}^\dagger D_{DW}$, needs to
converge (note that even though the Mobius operator contains three Dirac matrices per
row, see eq. (1), whereas Shamir's contains only one, both operators only require
$L_s$ Wilson Dirac applications per $D_{DW}$ application). The quality of the
approximation to the sign function is measured via the residual mass [17,18,19]. We
will find that Mobius' more general set of coefficients, as compared to Shamir and
Chiu/Borici, leads to a substantial reduction of the computational effort.
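The performance measure can be illustrated with a generic conjugate gradient solver on the normal equation that counts operator applications. This is a textbook sketch on a small dense matrix, not our production solver; the function name, the stopping criterion and the test matrix are assumptions made for the example.

```python
import numpy as np


def cg_normal_equations(apply_d, apply_d_dag, b, tol=1e-10, max_iter=1000):
    """Solve D x = b with CG applied to D^dagger D x = D^dagger b.

    Returns (x, iterations, applications), where `applications` counts every
    call to apply_d or apply_d_dag.
    """
    rhs = apply_d_dag(b)
    applications = 1
    x = np.zeros_like(rhs)
    r = rhs.copy()                    # residual of the normal equation
    p = r.copy()                      # search direction
    rr = np.vdot(r, r).real
    stop = tol**2 * rr                # relative stopping criterion
    iterations = 0
    while iterations < max_iter and rr > stop:
        dp = apply_d(p)
        ap = apply_d_dag(dp)
        applications += 2
        alpha = rr / np.vdot(dp, dp).real   # p^dagger D^dagger D p = |D p|^2
        x = x + alpha * p
        r = r - alpha * ap
        rr_new = np.vdot(r, r).real
        p = r + (rr_new / rr) * p
        rr = rr_new
        iterations += 1
    return x, iterations, applications


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    d = rng.standard_normal((8, 8)) + 8.0 * np.eye(8)
    b = rng.standard_normal(8)
    x, iterations, applications = cg_normal_equations(
        lambda v: d @ v, lambda v: d.T @ v, b
    )
    print(iterations, applications, np.max(np.abs(d @ x - b)))
```

Each iteration applies the operator and its adjoint once, which is the factor of two from the normal equation; multiplying the iteration count by $L_s$ then gives a count of Wilson Dirac applications in the sense used here.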

We perform our measurements on 20 quenched $16^3 \times 32$ gauge fields, generated
with the Wilson action at $\beta = 6.0$.

Our point of reference will be Shamir's operator with a widely used set of parameters,
$M_5 = 1.8$ and $b_i - c_i = 1.0$. We will refer to this setting as 'standard Shamir'.

For each set of parameters, we have to tune the quark mass $m$ such that the pion mass
agrees with standard Shamir. Since we find the residual mass of Mobius with $L_s = 8$
to be roughly equal to that of standard Shamir with $L_s = 16$, we adjust the pion
mass such that it agrees for these two cases. The pion mass dependence on the scaling
factor $b_i + c_i$ is weak and will therefore be neglected.

Figure 2. 'Standard Shamir' with $M_5 = 1.8$ and $b_i - c_i = 1.0$.

In all graphs, the 'number of Dirac applications' is normalized such that it
represents $L_s$ times the number of iterations per source. Hence, we neglect a factor
of two, due to the normal equation, $D_{DW}^\dagger D_{DW}$.

7.1. Figure 2

Fig. 2 shows the $L_s$ dependence of the residual mass for standard Shamir. We will
compare all our results to this data.

7.2. Figure 3

In fig. 3, we present our results for the Mobius operator at $L_s = 8$ and
$M_5 = 1.5$, but various $b_i - c_i$. We use a Shamir quark mass of $m = 0.06$, which
corresponds to a pion mass in lattice units of $m_\pi = 0.44$. The series of points
for a given $b_i - c_i$ corresponds to different values of $b_i + c_i$. It is
important to note that we choose all scaling coefficients to be the same, i.e.
$b_1 + c_1 = b_i + c_i, \ \forall i \in L_s$. In other words, we choose the polar
decomposition to approximate the sign function.

In general, the number of Dirac applications increases with $b_i + c_i$. For
$b_i + c_i = 1.75$ this behaviour starts to change. At $b_i + c_i = 2.0$ it is even
reversed and the number of Dirac applications falls with growing $b_i + c_i$.

Figure 3. Comparison of the Mobius operator with standard Shamir at $m_q = 0.06$ and
$M_5 = 1.5$ (except Shamir, which uses $M_5 = 1.8$). The series of points for a given
$b_i - c_i$ corresponds to different values of $b_i + c_i$. Neighbouring points
represent $b_i + c_i$ values which differ by 0.1. The $b_i + c_i$ values are:
[Borici: $b_i + c_i = 1.0, \ldots, 1.3$],
[$b_i - c_i = 0.75$: $b_i + c_i = 2.2, \ldots, 2.6$],
[$b_i - c_i = 1.0$: $b_i + c_i = 2.2, \ldots, 2.5$],
[$b_i - c_i = 1.25$: $b_i + c_i = 2.0, \ldots, 2.4$],
[$b_i - c_i = 1.5$: $b_i + c_i = 1.8, \ldots, 2.2$],
[$b_i - c_i = 1.75$: $b_i + c_i = 1.7, \ldots, 2.0$],
[$b_i - c_i = 2.0$: $b_i + c_i = 1.6, 1.9, 2.0$].
The optimal values are:
Chiu/Borici: $b_i + c_i = 1.2$,
$b_i - c_i = 0.75$: $b_i + c_i = 2.5$,
$b_i - c_i = 1.0$: $b_i + c_i = 2.4$,
$b_i - c_i = 1.25$: $b_i + c_i = 2.3$,
$b_i - c_i = 1.5$: $b_i + c_i = 2.1$,
$b_i - c_i = 1.75$: $b_i + c_i = 1.9$,
$b_i - c_i = 2.0$: $b_i + c_i = 1.6$.

7.3. Figure 4

In fig. 4, we analyse the dependence of Mobius' residual mass on $M_5$, see eq. (4).
We use the optimal $b_i - c_i = 1.5$, as determined in the analysis in fig. 3. Again,
the number of Wilson Dirac applications increases with $b_i + c_i$. We find that the
optimal $M_5$ values are $M_5 = 1.4$ and $M_5 = 1.5$, where $M_5 = 1.4$ reaches
smaller residual masses, but $M_5 = 1.5$ requires fewer Wilson Dirac applications.

As can be read off from the abscissa, for the optimal $b_i - c_i$ and $M_5$ values,
the Mobius operator is roughly two times cheaper than standard Shamir.

Figure 4. $M_5$ dependence of the Mobius operator, with optimal $b_i - c_i$ as
determined in fig. 3. $M_5 = 1.4$ and $M_5 = 1.5$ are optimal, where $M_5 = 1.4$
reaches smaller residual masses, but $M_5 = 1.5$ requires fewer Wilson Dirac
applications. The $b_i + c_i$ values are:
[$M_5 = 1.3$: $b_i + c_i = 2.9, \ldots, 3.2$],
[$M_5 = 1.4$: $b_i + c_i = 2.0, \ldots, 2.6$],
[$M_5 = 1.5$: $b_i + c_i = 1.8, \ldots, 2.2$],
[$M_5 = 1.6$: $b_i + c_i = 1.6, \ldots, 1.9$].

7.4. Figure 5

In fig. 5, we show how Mobius' residual mass behaves for larger $L_s$, with
$b_i - c_i = 1.5$ and $M_5 = 1.5$. Obviously, the relative improvement over standard
Shamir grows rapidly.

7.5. Figure 6

In fig. 6, we consider the behaviour of Mobius' residual mass for the smaller standard
Shamir quark mass $m = 0.02$, with $M_5 = 1.5$ and $b_i - c_i = 1.0$. Two facts are
worth mentioning. Firstly, compared to the analysis with $m = 0.06$, the factor of
improvement for these $M_5$ and $b_i - c_i$ values increases from 1.6 to 1.7. This
means that the advantage of Mobius over standard Shamir grows with falling quark mass.
Secondly, the optimal $b_i + c_i$ is equal to 2.4, both for $m = 0.02$ and
$m = 0.06$. This suggests that the tuning of the Mobius operator can be performed at
heavy quark masses, where the computation of the propagators is less expensive.

Figure 5. $L_s$ dependence of the residual mass, with $b_i - c_i = 1.5$ and
$M_5 = 1.5$. It can be seen that Mobius is numerically much better behaved than
standard Shamir.

7.6. Zolotarev coefficients and figure 7

As mentioned above, the scaling coefficients $b_i + c_i$ can take $L_s$ different
values. The optimal choice for the approximation to the sign function is given by the
Zolotarev coefficients [8]. In other words, of all possible choices for the
coefficients, Zolotarev will achieve the smallest residual mass. In fig. 7 we show
results that employ Zolotarev's coefficients at $L_s = 10$. We compare with the polar
decomposition (i.e. all coefficients equal) at $L_s = 16$. The $L_s$ are chosen such
that the two polynomials overlap as well as possible. The graph illustrates that
Zolotarev's performance is worse than what we found for the polar decomposition. Even
though Zolotarev's $L_s$ is much smaller, its number of iterations in the linear
system solver explodes.

This surprising behaviour is due to the fact that the convergence of the linear system
solver degrades with increasing $b_i + c_i$. For Zolotarev there are always
coefficients that are larger than the ones being used for the polar decomposition.
Even though there are smaller ones as well, the large ones are responsible for the
slow convergence.

Figure 6. Residual mass for the smaller standard Shamir quark mass $m = 0.02$, with
$M_5 = 1.5$, $b_i - c_i = 1.0$ and $b_i + c_i = 2.3, \ldots, 2.6$. The factor of
improvement over standard Shamir grows from 1.6 to 1.7, as compared to the $m = 0.06$
analysis. The optimal $b_i + c_i$ value is again 2.4, as for $m = 0.06$.

Figure 7. Comparison of the Zolotarev polynomial with the polar decomposition.
Scale = $(b_i + c_i)$ stands for a choice of the coefficients such that the Zolotarev
polynomial, with $L_s = 10$, overlaps with the polar decomposition at $L_s = 16$ as
well as possible (obviously, there is no overlap in the interval where the Zolotarev
polynomial is flat).